89 research outputs found
Leveraging Large Language Models and Weak Supervision for Social Media data annotation: an evaluation using COVID-19 self-reported vaccination tweets
The COVID-19 pandemic has presented significant challenges to the healthcare
industry and society as a whole. With the rapid development of COVID-19
vaccines, social media platforms have become a popular medium for discussions
on vaccine-related topics. Identifying vaccine-related tweets and analyzing
them can provide valuable insights for public health researchers and
policymakers. However, manual annotation of a large number of tweets is
time-consuming and expensive. In this study, we evaluate the usage of Large
Language Models, in this case GPT-4 (March 23 version), and weak supervision,
to identify COVID-19 vaccine-related tweets, with the purpose of comparing
performance against human annotators. We leveraged a manually curated
gold-standard dataset and used GPT-4 to provide labels without any additional
fine-tuning or instructing, in a single-shot mode (no additional prompting)
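The comparison described above, model-assigned labels scored against a human gold standard, can be sketched as follows. This is a minimal illustration only: the function name, the binary label encoding (1 = vaccine-related, 0 = not), and the toy label lists are assumptions made for the example, not data or code from the study, and the study's actual metrics are not stated in this excerpt.

```python
def precision_recall_f1(gold, predicted, positive=1):
    """Score predicted labels against gold labels for one positive class."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    pairs = list(zip(gold, predicted))
    # Count true positives, false positives, and false negatives.
    tp = sum(1 for g, p in pairs if g == positive and p == positive)
    fp = sum(1 for g, p in pairs if g != positive and p == positive)
    fn = sum(1 for g, p in pairs if g == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical toy labels (not from the paper): 1 = vaccine-related tweet.
gold_labels = [1, 0, 1, 1, 0, 0]    # human gold standard
model_labels = [1, 0, 0, 1, 1, 0]   # labels returned by the language model

precision, recall, f1 = precision_recall_f1(gold_labels, model_labels)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

The same function could score a weak-supervision labeler against the same gold standard, so that both approaches are compared on equal terms.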
Solar Event Tracking with Deep Regression Networks: A Proof of Concept Evaluation
With the advent of deep learning for computer vision tasks, the need for
accurately labeled data in large volumes is vital for any application. The
increasingly available large amounts of solar image data generated by the Solar
Dynamics Observatory (SDO) mission make this domain particularly interesting for
the development and testing of deep learning systems. The currently available
labeled solar data is generated by the SDO mission's Feature Finding Team's
(FFT) specialized detection modules. The major drawback of these modules is
that detection and labeling are performed at a cadence of 4 to 12 hours,
depending on the module. Since SDO image data products are created every 10
seconds, there is a considerable gap between labeled observations and the
continuous data stream. In order to address this shortcoming, we trained a deep
regression network to track the movement of two solar phenomena: Active Region
and Coronal Hole events. To the best of our knowledge, this is the first
attempt at solar event tracking using a deep learning approach. Since it is
impossible to fully evaluate the performance of the suggested event tracks with
the original data (only partial ground truth is available), we demonstrate with
several metrics the effectiveness of our approach. With the purpose of
generating continuously labeled solar image data, we present this feasibility
analysis showing the great promise of deep regression networks for this task.
Comment: 8 pages, 5 figures; submitted and accepted for publication at IEEE
Big Data 2019 - SABID Workshop
Pulse of the Pandemic: Iterative Topic Filtering for Clinical Information Extraction from Social Media
The rapid evolution of the COVID-19 pandemic has underscored the need to
quickly disseminate the latest clinical knowledge during a public-health
emergency. One surprisingly effective platform for healthcare professionals
(HCPs) to share knowledge and experiences from the front lines has been social
media (for example, the "#medtwitter" community on Twitter). However,
identifying clinically-relevant content in social media without manual labeling
is a challenge because of the sheer volume of irrelevant data. We present an
unsupervised, iterative approach to mine clinically relevant information from
social media data, which begins by heuristically filtering for HCP-authored
texts and incorporates topic modeling and concept extraction with MetaMap. This
approach identifies granular topics and tweets with high clinical relevance
from a set of about 52 million COVID-19-related tweets from January to mid-June
2020. We also show that because the technique does not require manual labeling,
it can be used to identify emerging topics on a week-to-week basis. Our method
can aid in future public-health emergencies by facilitating knowledge transfer
among healthcare workers in a rapidly-changing information environment, and by
providing an efficient and unsupervised way of highlighting potential areas for
clinical research.
Comment: 24 pages, 5 figures. To be published in the Journal of Biomedical
Informatics
A Large-Scale COVID-19 Twitter Chatter Dataset for Open Scientific Research-An International Collaboration
Funding: This work was partially supported by the National Institute of Aging through Stanford University's Stanford Aging and Ethnogeriatrics Transdisciplinary Collaborative Center (SAGE) (award 3P30AG059307-02S1). The work on the collection of Russian tweets was performed by Elena Tutubalina and supported by the Russian Science Foundation (grant number 18-11-00284).
As the COVID-19 pandemic continues to spread worldwide, an unprecedented amount of open data is being generated for medical, genetics, and epidemiological research. The unparalleled rate at which many research groups around the world are releasing data and publications on the ongoing pandemic is allowing other scientists to learn from local experiences and data generated on the front lines of the COVID-19 pandemic. However, there is a need to integrate additional data sources that map and measure the role of social dynamics of such a unique worldwide event in biomedical, biological, and epidemiological analyses. For this purpose, we present a large-scale curated dataset of over 1.12 billion tweets, growing daily, related to COVID-19 chatter generated from 1 January 2020 to 27 June 2021 at the time of writing. This dataset provides a freely available additional data source for researchers worldwide to conduct a wide and diverse range of research projects, such as epidemiological analyses, emotional and mental responses to social distancing measures, the identification of sources of misinformation, and stratified measurement of sentiment towards the pandemic in near real time, among many others.
Preliminary results on the application of the aminoacid racemization technique in the Murcia Region (SE Iberian Peninsula) and their interest in paleoseismological research
Geochronology is a critical issue in paleoseismological research. The
amino acid racemization technique offers important advantages over more
traditional dating methods: not only is the analysis faster and cheaper, but
the material to be analyzed, in this study terrestrial gastropods, is also
relatively easy to find. Racemization results allow the relative ages of
different sedimentary units to be compared from one trench to another. The
technique can also be used as a geochronological tool, provided that a
calibration curve has first been obtained for the particular climate of the
area and, ideally, for a particular genus. In this study we show the results
obtained from the analysis of 40 samples of terrestrial gastropods from 7
paleoseismological research trenches located in the Murcia Region (SE Spain).
Using the D/L ratio of aspartic acid, we show the coherence between relative
stratigraphic ages and racemization ages. Finally, we propose a provisional
conversion equation between the racemization ages obtained with the algorithm
of Torres et al. (1997) for gastropods from the central Iberian Peninsula and
the likely ages of the samples.
Primeros resultados sobre la aplicación de la técnica de racemización de aminoácidos en la Región de Murcia (SE de la Península Ibérica) y su interés en estudios de paleosismología
Geochronology is a critical issue in paleoseismological research. The amino acid racemization technique offers important advantages over more traditional dating methods: not only is the analysis faster and cheaper, but the material to be analyzed, in this study terrestrial gastropods, is also relatively easy to find. Racemization results allow the relative ages of different sedimentary units to be compared from one trench to another. The technique can also be used as a geochronological tool, provided that a calibration curve has first been obtained for the particular climate of the area and, ideally, for a particular genus. In this study we show the results obtained from the analysis of 40 samples of terrestrial gastropods from 7 paleoseismological research trenches located in the Murcia Region (SE Spain). Using the D/L ratio of aspartic acid, we show the coherence between relative stratigraphic ages and racemization ages. Finally, we propose a provisional conversion equation between the racemization ages obtained with the algorithm of Torres et al. (1997) for gastropods from the central Iberian Peninsula and the likely ages of the samples.
- …